Multi-phone strings as subword units for speech recognition
Abstract
The choice of speech unit affects the accuracy, complexity, expandability and ease of adaptation of ASR systems to speaker and environmental variations. This paper explores a method of subword modelling based on the concept of multi-phone strings. The motivation for using the longer-duration multi-phone strings is to reduce the loss of contextual information, cross-phone correlation, and transitions. Multi-phone strings are an alternative to context-dependent phones, and they include many of the syllables. An advantage of multi-phone units is that each monophone sequence has more than one valid multi-phone transcription, which can be used to improve ASR accuracy. A particular case of multi-phone strings, namely phone-pairs, is investigated in detail. Experimental evaluations on TIMIT and WSJCAM0 are presented.
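To illustrate why a single monophone sequence admits several multi-phone transcriptions, the following minimal sketch enumerates every way of splitting a phone sequence into non-overlapping units of one or two adjacent phones. The function name `segmentations`, the `max_len` parameter, and the example phone labels are assumptions made for illustration. The abstract does not specify how the paper builds its unit inventory or whether single phones are allowed alongside phone-pairs.

```python
from typing import List, Sequence, Tuple

# A unit is a tuple of one or more adjacent phones, e.g. ("k", "ae").
Unit = Tuple[str, ...]


def segmentations(phones: Sequence[str], max_len: int = 2) -> List[List[Unit]]:
    """Return every split of `phones` into consecutive units of 1..max_len phones.

    Hypothetical helper: with max_len=2 the units are single phones and
    phone-pairs, which is only one possible reading of the abstract.
    """
    if not phones:
        return [[]]  # exactly one way to segment an empty sequence
    results: List[List[Unit]] = []
    for n in range(1, min(max_len, len(phones)) + 1):
        head = tuple(phones[:n])
        # Combine this leading unit with every segmentation of the remainder.
        for tail in segmentations(phones[n:], max_len):
            results.append([head] + tail)
    return results


if __name__ == "__main__":
    for seg in segmentations(["k", "ae", "t"]):
        print(" ".join("-".join(unit) for unit in seg))
```

Under these assumptions the number of candidate transcriptions grows with the length of the sequence, so a recogniser could in principle score several alternatives for the same word rather than commit to one.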
Related resources
Constrained Subword Units for Speaker Recognition
Phonetic features have been proposed to overcome performance degradation in spectral speaker recognition in difficult acoustic conditions. The harmful effect of those conditions, however, is not restricted to spectral systems but also affects the performance of the open-loop phone recognisers on which phonetic systems are based. In automatic speech recognition, larger subword units and the use ...
Modelling Out-of-Vocabulary Words for Robust Speech Recognition
This thesis concerns the problem of unknown or out-of-vocabulary (OOV) words in continuous speech recognition. Most of today's state-of-the-art speech recognition systems can recognize only words that belong to some predefined finite word vocabulary. When encountering an OOV word, a speech recognizer erroneously substitutes the OOV word with a similarly sounding word from its vocabulary. Furthe...
Multi-Scale Spoken Document Retrieval for Cantonese Broadcast News
This paper presents the application of a multi-scale paradigm to Chinese spoken document retrieval (SDR) for improving retrieval performance. Multi-scale refers to the use of both words and subwords for retrieval. Words are basic units in a language that carry lexical meaning and subword units (such as phonemes, syllables or characters) are building components for words. Retrieval using subword...
An utterance verification system based on subword modeling for a vocabulary independent speech recognition system
This paper describes a Korean utterance verification system based on subword modeling for a vocabulary independent speech recognition system. We deploy a strategy consisting of two modules, recognition and verification, for utterance verification. In the recognition stage, multiple hypotheses with hypothesized word boundaries are obtained through Viterbi segmentation of the utterance. An...
An STD system for OOV query terms using various subword units
We have been proposing a Spoken Term Detection (STD) method for Out-Of-Vocabulary (OOV) query terms using various subword units, such as monophone, triphone, demiphone, one third phone, and Sub-phonetic segment (SPS) models. In the proposed method, subword-based ASR is performed for all spoken documents and subword recognition results are generated using subword acoustic models and subword lang...
Publication date: 1998